Layering Enterprise Application Services Using Semantic Firewalls



Cross-Reference to Related Application 
[0001] The present application is related to U.S. Provisional Patent Application No. 60/279,410,
filed March 29, 2001, entitled "Layering Enterprise Application Services Using Semantic
Firewalls," to David P. Glock et al., the contents of which are incorporated herein
by reference in their entirety.

Background of the Invention 
Field of the Invention 

[0002] The present invention relates generally to secure and efficient computer network
transactions, and more particularly to firewalls.

Related Art 

[0003] Many companies use extensible markup language (XML) dialects to encode their
e-business information models but fail to consider the information assurance aspects of their
business processes in these models. Usually, information assurance is considered an
afterthought once the basic enterprise information model is complete. In many cases,
enterprise applications must be rewritten in order to incorporate security, privacy, and
integrity checks that are outside the scope of the information model but entwined in the
business process itself. Changing information models and security policies exacerbate the
problem by forcing designers to develop complex, intertwined solutions that are not scalable
and are difficult to configure.

[0004] Currently, virtual private networks (VPN), site management solutions, encryption, and
packet firewalls are used to relieve application programs from the burden of handling session
management concerns. Most application programs, however, are not concerned with whether
a specific internet protocol (IP) address is disallowed, whether a user is barred from login, or
whether only certain users can invoke a common gateway interface (CGI) script. Indeed, these
issues are typically handled using external configuration files and other programs that can be
dynamically reconfigured without service interruptions. Most of these configuration files and
support programs can be managed by non-programmers with standard training and
certification. However, using external configuration files is problematic because they are
typically limited to simple parameter names and values that cannot be used to specify complex
rules and constraints.

[0005] IP firewalls and site management tools provide raw access control to uniform resource
locators (URLs), files, and directories, but role-based access control (RBAC) and task-based
access control (TBAC) are difficult to integrate into enterprise information models. Current
packet-based and file-based access control models are not powerful enough to manage access
decisions that depend on the data itself and the role of the person(s) viewing and changing the
data.

[0006] For example, FIG. 1 depicts a conventional enterprise information model 100 as a set
of extensible markup language (XML)-based enterprise applications 102a, 102b, and 102c
(generally 102), such as, e.g., Java servlets, that combine data-dependent access control with
the enterprise business logic. Each enterprise application 102 must decide on its own what
data to access, for example from a secure database 106, which clients 108a, 108b, and 108c
have access to specific data, and at what times the data is valid and accessible. The server
104 must in turn trust the resident applications 102 to obey the security, privacy, and integrity
policies set forth in the business practices of the organization. As one problem with this prior
art approach, an errant application could produce views or allow edits of sensitive data that
violate corporate access policies and standards.

[0007] FIG. 2 depicts an alternate conventional solution to the system of FIG. 1, where the
enterprise server 104 may use a security manager thread 202 to mediate all requests for
information and manage access control between the applications 102. The disadvantage of
this arrangement is that the security manager must be an integral part of the system design,
not an afterthought, and must be a basic component of the object model of the system.
Furthermore, management of information and assurance policies must be implemented
directly in computer source code to be effective. These policies cannot be easily changed
outside the system by administrative personnel due to the data-dependent characteristics.

[0008] The security manager model of FIG. 2 can be an effective approach for e-business
systems whose information assurance models are well established. But many business
models are still undergoing rapid evolution. Major sectors of the new e-business economy
continue to struggle with complex access control decisions that are data-dependent.

[0009] Healthcare and financial institutions, for example, cannot afford to use monolithic
enterprise solutions that tie the institution to one solution, because changing priorities and
budgets force the institution to seek outsourced services in competitive markets, such as, e.g.,
application service providers (ASP). Thus, the institutions must rely on open standards to
quickly integrate new providers, new partners, and new services. However, such standards
currently do not address information assurance problems, and those institutions must continue
to rely on costly stove-pipe information technology (IT) solutions.

[0010] Several projects and a few commercial products exist to filter web content between an
application server and a user agent browser. Most of these tools are focused on either
hypertext markup language (HTML) filtering or wireless application protocol (WAP)
transformations. Many of the transformation engines operate as web proxies at the client end
to enable personalization, privacy, ad filtering, or other user agent functions. Only IBM's
Transcoding Sphere solution begins to address XML-based content filtering, but mostly for
HTML to WAP transformations.

[0011] Lutris' Enhydra XML/Java application server provides capabilities to build enterprise
applications that can accept and produce XML as input and output respectively. The XMLC
tool of Lutris converts XML files into Java objects (by compiling the XML into a set of
JAXP invocations to create a document object model (DOM) tree). This conversion allows
XML site developers to build a programmable web publishing platform in XML and Java.
The Enhydra Project is documented at http://www.enhydra.org/.

[0012] The Apache Cocoon project has a similar architecture for enabling XML-based web
publishing. The Apache Jakarta project has a subproject called Struts that takes a blackboard
approach to simplifying the monolithic architecture of most web application servers like
J2EE, WebLogic, and BizTalk servers. This subproject strives to solve the problems of
monolithic web application server architectures through a simplified architectural pattern, but
does not provide the separation of filtering concerns and a rule-based approach.

[0013] Intel's Redirector tool is an XML-based content filter that redirects whole XML
content to web load management servers by determining content types and XML tags.
Redirector does not employ XML schema to make its decisions, but instead employs
tag-level decision making to re-route server output only.

[0014] The Muffin Web Proxy, http://muffin.doit.org/, FilterProxy,
http://filterproxy.sourceforge.net, and IBM's Web Intermediaries Project (WBI),
http://www.almaden.ibm.com/cs/wbi, (the research tool on which the Transcoding Sphere is
based) are initial attempts to place content filtering in a web proxy. The WBI tool has a
demonstration XSLT transformation example, but deals only with fixed style sheet
transformations based on style sheet processing instructions embedded in the XML content
itself. The prototype semantic firewall is implemented as a filter plug-in in both Muffin and
WBI.

[0015] Site management tools from companies like Netegrity and Oblix tend to focus solely
on URL, directory, and file-based access control to web sites and do not address content
filtering issues. While such filtering is essential to complete web security, this filtering is not
adequate to cover the growing concern for filtering content for authenticated users.

[0016] What is needed is an efficient, scalable, easy-to-configure, secure, private way to
manage and assure network transactions via analysis of the contents of the data stream itself
between client and server.

Summary of the Invention 

[0017] In an exemplary embodiment of the present invention, a system, method and computer
program product for layering enterprise application services using semantic firewalls is
disclosed.

[0018] In one exemplary embodiment, the present invention can be a system for processing
data requests from clients via a network, having an application server coupled to a network,
the application server providing content from a database to the clients via the network; and a
semantic firewall to pass and filter the content between the application server and the clients,
the semantic firewall restricting access to a portion of the content for at least one client.

[0019] In a second exemplary embodiment, the present invention can be a method of
processing a data request by a server, comprising the steps of receiving a data request from a
client via a network; retrieving requested data from a database; annotating the requested data
to obtain annotated data; filtering the annotated data to obtain filtered data; rendering the
filtered data to obtain rendered data; and providing the rendered data to the client via the
network.

[0020] Further features and advantages of the invention, as well as the structure and
operation of various embodiments of the invention, are described in detail below with
reference to the accompanying drawings.

Definitions 

[0021] A "computer" refers to any apparatus that is capable of accepting a structured input,
processing the structured input according to prescribed rules, and producing results of the
processing as output. Examples of a computer include: a computer; a general purpose
computer; a supercomputer; a mainframe; a super mini-computer; a mini-computer; a
workstation; a micro-computer; a server; an interactive television; a hybrid combination of a
computer and an interactive television; and application-specific hardware to emulate a
computer and/or software. A computer can have a single processor or multiple processors,
which can operate in parallel and/or not in parallel. A computer also refers to two or more
computers connected together via a network for transmitting or receiving information
between the computers. An example of such a computer includes a distributed computer
system for processing information via computers linked by a network.

[0022] A "computer-readable medium" refers to any storage device used for storing data
accessible by a computer. Examples of a computer-readable medium include: a magnetic
hard disk; a floppy disk; an optical disk, such as a CD-ROM and a DVD; a magnetic tape; a
memory chip; and a carrier wave used to carry computer-readable electronic data, such as
those used in transmitting and receiving e-mail or in accessing a network.

[0023] "Software" refers to prescribed rules to operate a computer. Examples of software
include: software; code segments; instructions; computer programs; and programmed logic.

[0024] A "computer system" refers to a system having a computer, where the computer
comprises a computer-readable medium embodying software to operate the computer.

[0025] A "network" refers to a number of computers and associated devices that are
connected by communication facilities. A network involves permanent connections such as
cables or temporary connections such as those made through telephone or other
communication links. Examples of a network include: an internet, such as the Internet; an
intranet; a local area network (LAN); a wide area network (WAN); and a combination of
networks, such as an internet and an intranet.

Brief Description of the Drawings 
[0026] The foregoing and other features and advantages of the invention will be apparent
from the following, more particular description of a preferred embodiment of the invention,
as illustrated in the accompanying drawings wherein like reference numbers generally
indicate identical, functionally similar, and/or structurally similar elements. The left-most
digits in the corresponding reference number indicate the drawing in which an element first
appears.

[0027] FIG. 1 depicts a conventional approach to network transaction and security
management;

[0028] FIG. 2 depicts another conventional approach to network transaction and security
management;

[0029] FIG. 3 depicts an exemplary embodiment of a system for network transaction and
security management according to the present invention;

[0030] FIG. 4 shows a second exemplary embodiment of a system for network transaction
and security management according to the present invention;

[0031] FIG. 5 depicts an exemplary embodiment of a semantic firewall according to the
present invention;

[0032] FIG. 6 depicts an exemplary embodiment of the first stage of annotation according to
the present invention;

[0033] FIG. 7 depicts an exemplary embodiment of the second stage of filtering according to
the present invention;

[0034] FIG. 8 depicts an exemplary embodiment of the third stage of rendering HTML
according to the present invention;

[0035] FIG. 9 depicts an exemplary embodiment of an XML record according to the present
invention;

[0036] FIG. 10 depicts an exemplary embodiment of the output of a rule-based
transformation according to the present invention;

[0037] FIG. 11 depicts an exemplary embodiment of raw semantic firewall rules according to
the present invention;

[0038] FIG. 12 depicts an exemplary embodiment of an HTML form page for modifying
rules according to the present invention;

[0039] FIG. 13 depicts an exemplary embodiment of an XML style sheet according to the
present invention;

[0040] FIG. 14 depicts an exemplary embodiment of a final XML output file after the
application of an XML style sheet according to the present invention;

[0041] FIG. 15 depicts an exemplary embodiment of a browser view of a transformed XML
file according to the present invention;

[0042] FIG. 16 depicts an exemplary query result in XML according to the present invention;

[0043] FIG. 17 depicts the result of a record transformation according to the present
invention;

[0044] FIG. 18 depicts an annotated XML file according to the present invention;

[0045] FIG. 19 depicts an exemplary style sheet according to the present invention;

[0046] FIG. 20 depicts an exemplary XHTML output of a filtering process of the present
invention; and

[0047] FIG. 21 depicts an exemplary embodiment of a filter chain according to the present
invention.

Detailed Description of an Exemplary Embodiment of the Present Invention 
[0048] A preferred embodiment of the invention is discussed in detail below. While specific
exemplary embodiments are discussed, it should be understood that this is done for
illustration purposes only. A person skilled in the relevant art will recognize that other
components and configurations can be used without parting from the spirit and scope of the
invention.

[0049] FIG. 3 depicts an exemplary embodiment of a system for network transaction and
security management according to the present invention. A semantic firewall 312, which can
be, for example, an XML-based filter, lies outside the core enterprise applications 302a,
302b, and 302c (generally 302). The semantic firewall 312 acts as a layer between requests
for data from clients 308a, 308b, and 308c (generally 308) and the enterprise server 304. The
clients 308 no longer interact directly with the enterprise applications 302 as in the
conventional approaches illustrated in FIGS. 1 and 2. Instead, the semantic firewall 312
receives client requests and transforms the requests to forms that are appropriate for the role
and level of access of the client. The transformed forms are then given to the applications
302, which no longer need to be responsible for security or data accessibility restrictions.
Similarly, data retrieved by an enterprise application 302 from a secure database 306 is
passed through the semantic firewall 312 to the clients 308, rather than directly to the clients
from the application as in the conventional approach. This approach allows the semantic
firewall to limit the access a client has to data without having to rely on the application to
limit the access. Within the semantic firewall layer there may be additional "layers" of
sub-filters that perform sub-tasks within the filtering process. For example, one filter may attach
personalization information while the next filter uses personalization information and the
XML data stream to internationalize the content.
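
By way of illustration only, the layering of sub-filters described above can be sketched in Python as a chain of functions, each consuming the output of the previous one. The message representation (a plain dictionary), the preference table, the translation table, and all function names are assumptions introduced for this sketch and do not appear in the figures.

```python
from typing import Callable, Dict, List

# A message is modeled as a plain dict; a sub-filter takes one and returns a new one.
Message = Dict[str, str]
SubFilter = Callable[[Message], Message]

# Hypothetical per-user preferences and translations, for illustration only.
PREFERENCES = {"alice": {"lang": "fr"}}
GREETINGS = {"en": "Welcome", "fr": "Bienvenue"}


def personalize(message: Message) -> Message:
    """Attach personalization information for the requesting user."""
    prefs = PREFERENCES.get(message.get("user", ""), {})
    return {**message, "lang": prefs.get("lang", "en")}


def internationalize(message: Message) -> Message:
    """Use the attached language to localize the content."""
    lang = message.get("lang", "en")
    return {**message, "greeting": GREETINGS.get(lang, GREETINGS["en"])}


def run_chain(message: Message, chain: List[SubFilter]) -> Message:
    """Apply each sub-filter in order, feeding each output to the next."""
    for sub_filter in chain:
        message = sub_filter(message)
    return message


result = run_chain({"user": "alice"}, [personalize, internationalize])
```

Because each sub-filter only reads and extends the message, filters can be reordered or added without changing the others, provided a filter runs after the filters whose output it depends on.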

[0050] FIG. 4 shows a second exemplary embodiment of a system for network transaction
and security management according to the present invention. In FIG. 4, the semantic firewall
312 lies within the enterprise application server 404, but only to share server resources such
as, for example, the CPU, files, and communication channels. The semantic firewall 312 of
FIG. 4 can be the same as semantic firewall 312 in FIG. 3. The semantic firewall 312 of FIG.
4 can also run on the same machine as the application server. The semantic firewall 312 can
also work in conjunction with the existing security manager 402.

[0051] In either embodiment of FIG. 3 or FIG. 4, the semantic firewall 312 can also work
with the enterprise application server 304 or 404, respectively, to provide highly
re-configurable, scalable, data-specific, role-based, and task-dependent access management.
The semantic firewall 312 can be based on software that allows customized, automated
form-fill for HTML-based forms. The software can be deployed as an intranet or extranet
application service, or as an Internet consumer service. The semantic firewall software
allows for login-based access control of information used to fill out web-based forms.
Form-fill can be viewed as a "filtering" action on the profile information of a person, which can be
stored in the secure database 306. For example, the history of form-fill actions for a user can
be stored in the database. A query for current information to be used to fill out the form is
performed and returned to the application in use by the client. The semantic firewall can
check the integrity of the information regarding profiles, frequency of use, inter-field
dependency, and other policy level filtering, encryption, and encodings that are somewhat
independent of the profile information of the client.

[0052] FIG. 5 depicts an exemplary embodiment of a semantic firewall 312 according to the
present invention as illustrated in FIG. 3. Raw XML content is aggregated from the backend
systems known as a naive application server (NAS) 502. NAS 502 can also be an enterprise
application server 304. The aggregation of raw XML content can be the result of a query
from the client, which might be executed in SQL. The NAS 502 interacts directly with the
data repositories such as, e.g., databases, document archives, legacy systems, or secure
databases 306, which can return the query result, for example, as an XML fragment. The
NAS 502 is said to be "naive" because it is concerned only with the core business logic that
processes raw repository requests such as database inserts, updates, selects, and document
searches.

[0053] The semantic firewall 312 can be built on open, XML-based standards that allow any
enterprise to focus on its core business logic and data modeling tasks and allow enterprise
managers to separate information assurance concerns from the core business logic. The
semantic firewall allows managers to control the security aspects in an easily configurable
firewall outside the core system. The semantic firewall application program interface (API)
can be configured with enterprise servers such as, e.g., J2EE, WebLogic, WebSphere, and
Enhydra, to pre-process and post-process XML content via a simple API for XML
(SAX)-based API.

[0054] The semantic firewall 312 performs a series of filtering operations on, for example,
XML content, between the client and server using extensible stylesheet language
transformations (XSLT) that are dynamically generated by a policy constraint rule engine 520.
Annotated XML schema 512 are used to define the syntax for the XML content, and constraint
rules are used to perform semantic transformations on the XML content that can add, delete,
censor, encrypt, and decrypt field contents. The semantic firewall 312 can also be configured
to log, audit, trace, and augment content to and from the client and server. The semantic
firewall relies on standards such as, for example, SAML (Security Assertion Markup Language),
XML-Sig (XML Digital Signature standard), and XML-Encryption to perform this filtering.

[0055] The constraint rules in the constraint rule engine 520 are used to generate dynamic
XSLT style sheets (not shown) that are used to enforce role-based and task-based access
control rules that are expressed in XML-based policy rules. These policy rules (called
XRules) can be expressed at high semantic levels relative to the schema constructs. By
treating XSLT style sheets as the "assembly code" of the transform process, a semantic
firewall is easily re-configurable across a wide variety of XML Schema types. This relieves
the XML portal manager from authoring and managing a large number of XSLT files.
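
As a minimal sketch of the "assembly code" idea, and not the XRules format itself, the following Python fragment generates an XSLT 1.0 style sheet from a small policy table. The policy table, the role names, and the function name are hypothetical; the field names BADDRESS and BPHONE are borrowed from the patient record example. The generated sheet uses the standard identity template plus one empty template per hidden field, so that matching elements produce no output.

```python
import xml.etree.ElementTree as ET
from typing import Dict, List

XSL_NS = "http://www.w3.org/1999/XSL/Transform"

# Hypothetical policy: for each role, the element names that must be removed.
POLICY: Dict[str, List[str]] = {
    "receptionist": ["BADDRESS", "BPHONE"],
    "physician": [],
}


def generate_stylesheet(role: str) -> str:
    """Build an XSLT 1.0 style sheet that copies everything except hidden fields."""
    hidden = POLICY.get(role)
    if hidden is None:
        raise KeyError(f"no policy for role {role!r}")
    parts = [
        f'<xsl:stylesheet version="1.0" xmlns:xsl="{XSL_NS}">',
        # Identity template: copy every node and attribute unchanged.
        '  <xsl:template match="@*|node()">',
        '    <xsl:copy><xsl:apply-templates select="@*|node()"/></xsl:copy>',
        "  </xsl:template>",
    ]
    # One empty template per hidden field: matching elements produce no output.
    for name in hidden:
        parts.append(f'  <xsl:template match="{name}"/>')
    parts.append("</xsl:stylesheet>")
    return "\n".join(parts)


sheet = generate_stylesheet("receptionist")
root = ET.fromstring(sheet)  # raises ParseError if the output is not well-formed XML
```

The sketch only generates and parses the style sheet. Applying it to content requires an XSLT processor, which the Python standard library does not include.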

[0056] The constraint-based approach of the invention allows the semantic firewall to be
easily configured by system administration and management personnel and reviewed by
security policy experts. The semantic firewall can be configured, and reconfigured, for many
e-business models, including, for example, healthcare and financial institutions, to express
complex, data-specific, role-based, and workflow-dependent access rules. Industry-wide
security standards and governmental laws can be based on such standards as they evolve with
the core business models as well.

[0057] The semantic firewall 312 can be configured to add, delete, or transform, for example,
XML content, to and from the NAS 502. The semantic firewall can be configured to perform
multi-stage XML and HTML transformations as a server-based HTTP proxy. In an
exemplary embodiment, the transformation can take place in three stages.

[0058] Prior to the first stage 510, the semantic firewall 312 can aggregate content from the
NAS 502 which can produce, for example, various XML files 504, 506, and 508. In the first
stage 510, the semantic firewall can annotate the XML with new attributes using annotated
XML schema 512. In the second stage 514, the semantic firewall can filter the output of the
first stage based on the attributes added during annotation. Finally, in the third stage 516, the
semantic firewall can apply a dynamically generated XSLT transformation from static
business XSLT style sheets 518 to render filtered XML or HTML content. Output from the
transformations can include, for example, one or more files, such as an HTML file 526, a
simple object access protocol (SOAP) message 528, an XML file 530, or an AuthML file 532
containing an encrypted message. FIG. 5 shows an XML response from server to client as
the result of a request, but the request (an XML message) can also be filtered on its way from
client to server. The request, for example, an HTTP request for a URL, can select resources,
store a file, or initiate a query.
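
The three stages can be sketched, under simplifying assumptions, with the Python standard library. In this sketch a fixed lookup table stands in for the annotated schema and the rule engine, plain functions stand in for the generated XSLT, and the element name FOLLOWUP is hypothetical; none of these choices is taken from the figures.

```python
import xml.etree.ElementTree as ET

# Hypothetical access table standing in for the annotated schema and rules.
ACCESS = {"FNAME": "view", "BPHONE": "none", "FOLLOWUP": "edit"}


def annotate(record: ET.Element) -> ET.Element:
    """Stage 1: mark every field with an ACCESS attribute (default: none)."""
    for field in record:
        field.set("ACCESS", ACCESS.get(field.tag, "none"))
    return record


def filter_fields(record: ET.Element) -> ET.Element:
    """Stage 2: drop fields whose annotation forbids viewing."""
    for field in list(record):
        if field.get("ACCESS") == "none":
            record.remove(field)
    return record


def render(record: ET.Element) -> str:
    """Stage 3: render the surviving fields as a simple HTML table."""
    table = ET.Element("table")
    for field in record:
        row = ET.SubElement(table, "tr")
        ET.SubElement(row, "th").text = field.tag
        cell = ET.SubElement(row, "td")
        if field.get("ACCESS") == "edit":
            # Editable fields become form inputs; viewable fields become text.
            ET.SubElement(cell, "input", name=field.tag, value=field.text or "")
        else:
            cell.text = field.text
    return ET.tostring(table, encoding="unicode", method="html")


raw = (
    "<PATIENT><FNAME>Ann</FNAME><BPHONE>555-0100</BPHONE>"
    "<FOLLOWUP>2001-04-30</FOLLOWUP></PATIENT>"
)
html = render(filter_fields(annotate(ET.fromstring(raw))))
```

Fields that the table does not mention default to "none" in this sketch, so an unrecognized field is withheld rather than exposed.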

[0059] The annotations and transformations are generated by high-level rules. For example,
using a CLIPS-like, forward and backward chaining expert system engine, organizational
policies can be mapped to XML transformations. These rules are expressed as expert system
rules, and the transformation is managed by syntax-directed translations relative to the XML
Schema for the XML content.
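
The forward-chaining half of such an engine can be illustrated with a deliberately small Python sketch. It is not CLIPS: it has no variables, no pattern language, and no backward chaining, and the facts and rules shown are invented for the example.

```python
from typing import FrozenSet, List, Set, Tuple

Fact = Tuple[str, ...]
Rule = Tuple[FrozenSet[Fact], Fact]  # (conditions, conclusion)

# Hypothetical policy rules: if all conditions are known facts, add the conclusion.
RULES: List[Rule] = [
    (frozenset({("role", "receptionist")}), ("access", "BPHONE", "none")),
    (frozenset({("role", "receptionist")}), ("access", "FNAME", "view")),
    (frozenset({("access", "BPHONE", "none")}), ("transform", "delete", "BPHONE")),
]


def forward_chain(facts: Set[Fact], rules: List[Rule]) -> Set[Fact]:
    """Fire rules repeatedly until no rule adds a new fact (a fixed point)."""
    known = set(facts)
    changed = True
    while changed:
        changed = False
        for conditions, conclusion in rules:
            if conditions <= known and conclusion not in known:
                known.add(conclusion)
                changed = True
    return known


derived = forward_chain({("role", "receptionist")}, RULES)
```

Here an organizational policy fact (the role) leads, through two rule firings, to a transformation fact (delete BPHONE) that a later stage could turn into a style sheet template.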

[0060] The semantic firewall 312 can also have a session management module 522 that can
retain state information between subsequent requests and/or responses. For example,
shopping online is implemented as a series of page requests and responses. The "shopping
basket" is the "state" maintained between page requests that allows the server to reason
about which step of the process the user is currently in. The semantic firewall can
also have a rule maintenance module 524 that provides an interface to modify the rules.

[0061] In an alternative embodiment, the XSLT style sheet itself is not rendered as a
serialized stream, but rather as a transform object, i.e., a series of templates that represent
SAX event handlers.
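
To illustrate what a transform expressed as SAX event handlers might look like, the following Python sketch hand-writes one such handler using the standard xml.sax module. It suppresses the events of any element annotated ACCESS="none". The class name and the sample record are assumptions, and in the embodiment described such handlers would be generated from rules rather than written by hand.

```python
import io
from xml.sax import parseString
from xml.sax.saxutils import XMLFilterBase, XMLGenerator


class AccessFilter(XMLFilterBase):
    """Suppress SAX events for any element annotated ACCESS="none"."""

    def __init__(self, downstream):
        super().__init__()
        self.setContentHandler(downstream)
        self._skip_depth = 0  # > 0 while inside a suppressed element

    def startElement(self, name, attrs):
        if self._skip_depth or attrs.get("ACCESS") == "none":
            self._skip_depth += 1
            return
        super().startElement(name, attrs)

    def endElement(self, name):
        if self._skip_depth:
            self._skip_depth -= 1
            return
        super().endElement(name)

    def characters(self, content):
        if not self._skip_depth:
            super().characters(content)


source = (
    '<PATIENT><FNAME ACCESS="view">Ann</FNAME>'
    '<BPHONE ACCESS="none">555-0100</BPHONE></PATIENT>'
)
out = io.StringIO()
# The filter receives parser events and forwards the permitted ones downstream.
parseString(source.encode("utf-8"), AccessFilter(XMLGenerator(out, encoding="utf-8")))
filtered = out.getvalue()
```

Because events are forwarded as they arrive, no serialized intermediate document is built, and several such handlers can be chained into a pipeline.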

[0062] FIG. 6 depicts an exemplary embodiment of the first stage 510 of annotation for the
semantic firewall. First, in block 602, the semantic firewall accepts the raw XML data 604
from the NAS 502, for example, in the form of files 504, 506, and 508. Next, in block 606,
the annotated XML schema 512 are applied to the XML data from block 604. The XML
schema 512 are annotated with semantic actions that direct the transformation process. These
semantic actions generate XSLT in much the same way that a compiler produces machine or
byte code, but the production is contextually dependent (in most cases) on the input XML
data itself. The XML content dictates, via a DOCTYPE directive or XML namespace
attribute, the XML schema or document type definition (DTD) 512 to be used for the
transformation generating process, and parameterizes the generated XSLT style sheets. The
XML schema annotations are parsing actions that are implemented as an extension
namespace and can also be used to generate facts in an expert system engine and to interface
with document search engines and other filtering engines. The application of the annotated XML
schema changes the XML file 504, 506 or 508 to an annotated intermediate file containing
more attributes in block 610. The intermediate file is then passed to the second stage of
filtering in block 612.
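
The way the content itself selects the schema can be sketched as follows, assuming the selection is made by XML namespace. The registry, the namespace URI, and the function names are hypothetical, and a simple table of per-field actions stands in for a real annotated schema.

```python
import xml.etree.ElementTree as ET

# Hypothetical registry: namespace URI -> per-field semantic actions.
SCHEMAS = {
    "urn:example:patients": {"FNAME": "view", "BPHONE": "none"},
}


def namespace_of(element: ET.Element) -> str:
    """Return the namespace URI from an ElementTree '{uri}local' tag, or ''."""
    return element.tag[1:].split("}", 1)[0] if element.tag.startswith("{") else ""


def local_name(element: ET.Element) -> str:
    """Strip the '{uri}' prefix from a tag."""
    return element.tag.rsplit("}", 1)[-1]


def annotate_with_schema(root: ET.Element) -> ET.Element:
    """Pick the annotated schema named by the content, then mark each field."""
    actions = SCHEMAS[namespace_of(root)]  # KeyError if no schema is registered
    for field in root.iter():
        if field is not root:
            field.set("ACCESS", actions.get(local_name(field), "none"))
    return root


doc = ET.fromstring(
    '<PATIENT xmlns="urn:example:patients">'
    "<FNAME>Ann</FNAME><BPHONE>555-0100</BPHONE></PATIENT>"
)
annotated = annotate_with_schema(doc)
```

Content whose namespace has no registered schema raises an error in this sketch instead of passing through unannotated.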

[0063] FIG. 7 depicts an exemplary embodiment of the second stage of filtering 514 for the
semantic firewall 312. After the annotated file is accepted from the first stage in step 702, the
rules engine dynamically generates a style sheet in block 704, using the rules 706. The style
sheet is then applied to the annotated file in block 708. The fields in the annotated file are
filtered when the style sheet removes or hides the fields that the user or client should not see.
The filtered file is then passed to the third stage of rendering in block 710.

[0064] FIG. 8 depicts an exemplary embodiment of the third stage 516 of rendering
transformed XML output according to the present invention. After accepting the filtered file
in block 802, the fields in the filtered file are formatted according to static style sheets 518 in
block 804. The application of the static style sheets can result, for example, in an HTML file,
which is generated based on the style sheets and the data in block 806 and passed to the
client.

[0065] For an exemplary embodiment employing a semantic firewall, consider a medical
application service provider (ASP) servicing insurance claims over the Internet. The medical
ASP must provide secure access to patient records for hospitals, physician offices,
pharmacies, and claims agents. Each user must authenticate a session with the ASP system
using a single sign-in login and password. The medical ASP system uses HTML-based
forms to grant the user access to patient information based on the role of the user. The user
may be able to view, change, or add information based on their role, the current status of a
task, the sensitivity of the data itself, or a combination of factors. The application server
must store and retrieve medical records to and from a database or collection of databases.
Many of these databases may be from legacy systems. The application server must also
manage session information, determine role permissions, task status, and graphical user
interface (GUI) issues. Changing data permissions within the application logic can be a
complex task. Policy changes often imply vast architectural changes that can overburden
small organizations.

[0066] FIG. 9 depicts an exemplary output file 902, patients.xml, from a query for all patients
for a particular physician, Dr. Pat Jones here. The partial record for a single patient is shown
as a single XML-based patient record 904 within a list of patients. Within the record, there
can be one or more fields, for example, first name field 906, business phone number field 908
and follow-up visit date 910. In this example, the current user is a receptionist within the
medical claims provider and is permitted to view only a limited set of fields in the patient
record. The receptionist is allowed to view only non-billing information and is able to edit
only the date of a follow-up visit.

[0067] In this example, the semantic firewall is configured to perform the transformation for
patient record content in three stages 510, 514, and 516. In the first stage of annotating, the
semantic firewall accepts the patient record collection shown in FIG. 9 as input and applies
rule-based transformations to produce the file shown in FIG. 10. Typically, this output is not
actually rendered, but can be represented as a document object model (DOM) tree or series of
SAX events in a transformation pipeline. Each field is marked with a new ACCESS
attribute.

[0068] A rule used to transform the file in FIG. 9 is shown in FIG. 11. The rule is: a
physician who is not the patient's own physician (e.g., another doctor at the hospital) is
allowed to view billing address information and edit the follow-up visit date and time. In
accordance with this rule, field 906 becomes field 1006, having an additional attribute of
access = "view", meaning that the field is viewable by the user who is a physician.
Similarly, field 908 becomes field 1008 and is viewable by the user; and field 910 becomes
field 1010 having the attribute of being editable by the user. The date of the next follow-up
visit is also incremented by approximately one month, taking into account holidays and
weekends. The remaining fields in 904 are similarly processed based on the rules.
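
One possible reading of the date adjustment is sketched below in Python. The description does not state the exact policy, so the choices made here (same day of the following month, clamped to the month's length, then rolled forward to the next day that is neither a weekend nor a listed holiday) are assumptions, as are the function names and the single-entry holiday calendar.

```python
import calendar
from datetime import date, timedelta
from typing import FrozenSet

# Hypothetical holiday calendar; a real deployment would load its own.
HOLIDAYS: FrozenSet[date] = frozenset({date(2001, 5, 28)})


def add_one_month(day: date) -> date:
    """Same day next month, clamped to that month's last day."""
    year, month = (day.year + 1, 1) if day.month == 12 else (day.year, day.month + 1)
    last = calendar.monthrange(year, month)[1]
    return date(year, month, min(day.day, last))


def next_follow_up(last_visit: date, holidays: FrozenSet[date] = HOLIDAYS) -> date:
    """Roughly one month later, rolled forward past weekends and holidays."""
    candidate = add_one_month(last_visit)
    while candidate.weekday() >= 5 or candidate in holidays:
        candidate += timedelta(days=1)
    return candidate
```

For example, a visit on April 26, 2001 first maps to Saturday, May 26, 2001, then rolls past the weekend and the listed Monday holiday to Tuesday, May 29, 2001.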

[0069] An example of a raw CLIPS rule, i.e., the textual computer program, used to create the
file in FIG. 10 is shown in FIG. 11. The rule mentioned above is expressed as a CLIPS rule
in the system and is used to guide the transformation process of the XML content produced
by the NAS. A rule consists of conditions on the left-hand side (LHS) of the "=>" symbol,
and actions on the right-hand side (RHS) of the symbol. XML content is processed into
a tuple space. If all conditions match on the LHS of a rule, the rule "fires", and the actions on
the RHS are performed. For example, lines 1104, 1106, 1108, and 1110 represent patterns
within the condition of a rule (called rule6). Each pattern contains fixed content or variables.
The elements NAME, POSITION and Physician in line 1106 are fixed, while the term
"?ename" is a variable. Variables match any fixed content of a corresponding tuple in the
tuple space (such as the tuple "(EMPLOYEES-EMPLOYEE (NAME Fred) (POSITION
Physician))" from the XML content scanned into the system). Some variables can match
none, one, or any number of terms in a tuple. For example, the variable $?rules in line 1110
matches the list of rules that are currently active. Named variables are bound to their values
on the entire LHS of a rule. For example, if line 1104 matches the tuple
"(SESSIONS-SESSION (NAME Fred))", line 1106 must match the tuple "(EMPLOYEES-EMPLOYEE
(NAME Fred) (POSITION Physician))" in the tuple space. If such a tuple does not exist, the
entire rule fails to fire because it does not apply. Thus, the rule in FIG. 11 is interpreted as
"for the current logged-in user whose name is ?ename (line 1104), and who is a Physician
(line 1106), and not the patient's assigned physician ("~?ename" means "NOT EQUAL to
?ename" in line 1108), and not yet under application of this rule (line 1110) (this rule cannot
be applied more than once), set the ACCESS attribute to VIEW (line 1112) in the current
patient for the fields PID, FNAME, LNAME, BADDRESS, BCITY, BSTATE, BZIP,
BPHONE, and LASTVISIT (line 1114)." Line 1116 creates and stores the definition of rule6
in the rules for patients. Other rules (not shown) mark the ACCESS attribute to "none" or
"edit" as their conditions dictate, while still other rules delete XML nodes during the
transformation from FIG. 9 to the content in FIG. 10 such as INSURER, INSNUM,
DOCTOR, VISITTIME, PURPOSE, SEENBY, and DIAGNOSIS.
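
The variable binding described above, in which a named variable must take the same value in every pattern on the LHS, can be illustrated with a short Python sketch. This is not CLIPS and not the rule of FIG. 11; the functions match and fire, the dictionary representation of tuples, and the third tuple (Alice) are assumptions made only for illustration.

```python
# Tuples scanned from XML content, written as (relation, {slot: value}).
TUPLES = [
    ("SESSIONS-SESSION", {"NAME": "Fred"}),
    ("EMPLOYEES-EMPLOYEE", {"NAME": "Fred", "POSITION": "Physician"}),
    ("EMPLOYEES-EMPLOYEE", {"NAME": "Alice", "POSITION": "Receptionist"}),  # hypothetical
]

def match(pattern, fact, bindings):
    """Return the extended bindings if fact matches pattern, otherwise None."""
    relation, slots = pattern
    if relation != fact[0]:
        return None
    new = dict(bindings)
    for slot, want in slots.items():
        have = fact[1].get(slot)
        if have is None:
            return None
        if want.startswith("?"):
            # A variable binds on first use and must agree on every later use.
            if new.setdefault(want, have) != have:
                return None
        elif want != have:
            # Fixed content must match exactly.
            return None
    return new

def fire(patterns, facts, bindings=None):
    """Yield each set of bindings that satisfies all patterns of the LHS."""
    bindings = bindings or {}
    if not patterns:
        yield bindings
        return
    for fact in facts:
        extended = match(patterns[0], fact, bindings)
        if extended is not None:
            yield from fire(patterns[1:], facts, extended)

LHS = [
    ("SESSIONS-SESSION", {"NAME": "?ename"}),
    ("EMPLOYEES-EMPLOYEE", {"NAME": "?ename", "POSITION": "Physician"}),
]
results = list(fire(LHS, TUPLES))
```

If the session tuple named Alice instead of Fred, no employee tuple would satisfy the second pattern and the rule would not fire.
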
[0070] The rules themselves can be maintained and changed via an HTML form page such as
the form shown in FIG. 12. The rule maintenance page 1202 shows the rule number 1204
and the rule description 1206. The rule setter, for example, a manager or a system
administrator, can alter parameters of the rule logic 1208. For each field of data, the rule
setter may choose rule attributes. In this example, the choices are whether the field is
viewable, not viewable or editable. In this manner, a set of rules for a semantic firewall can
be configured and managed by non-programmers without service interruption. The rule
maintenance page itself can be automatically generated by the system, or written by the rule
author.

[0071] In the second stage of filtering by the semantic firewall 312 for this example, a style
sheet 1302 as shown in FIG. 13 is produced dynamically. The style sheet is generated with
information for the current user. This XSLT style sheet enforces the semantics of the newly
added attributes by deleting certain fields from the XML content. For example, line 1304
copies all of a patient record. Line 1306 copies all fields with the ACCESS attribute equal to
'view'. Similarly, line 1308 copies all fields with the 'edit' attribute. Line 1310 matches and
deletes any field whose ACCESS attribute is equal to 'none' (meaning not viewable or
editable), removing that field from the viewable data.
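
The effect of the filtering stage can be sketched as follows. The specification performs this step with a dynamically generated XSLT style sheet; the sketch below substitutes a plain tree walk in Python, and the record contents and the function name filter_fields are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical annotated record in the spirit of FIG. 10.
ANNOTATED = """<PATIENT>
  <PID ACCESS="view">1001</PID>
  <INSURER ACCESS="none">Acme Health</INSURER>
  <FOLLOWUP ACCESS="edit">2001-04-02</FOLLOWUP>
</PATIENT>"""

def filter_fields(patient):
    """Copy the record, keeping only fields marked 'view' or 'edit'."""
    kept = ET.Element(patient.tag, patient.attrib)
    for field in patient:
        # Fields marked 'none', or carrying no ACCESS attribute at all, are dropped.
        if field.get("ACCESS") in ("view", "edit"):
            kept.append(field)
    return kept

filtered = filter_fields(ET.fromstring(ANNOTATED))
remaining = [f.tag for f in filtered]
serialized = ET.tostring(filtered, encoding="unicode")
```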

[0072] The result of applying the style sheet of FIG. 13 to the annotated file of FIG. 10 is
shown in FIG. 14. In FIG. 14, only the viewable fields of the patient record 1404, such as
1406 and 1408, and editable fields, such as 1410, from FIG. 10 remain.

[0073] In the third stage of rendering by the semantic firewall 312 for this example, a static
style sheet is applied to transform the final XML into HTML for presentation by the server or
by a client, such as, for example, a user agent or browser. The style sheet can be applied by
the semantic firewall, a redirecting server, or the browser itself. In the latter case, it proves
valuable for the firewall to have eliminated XML content before sending the XML content to
the browser. Doing so prevents secure data from being sent to the browser and intercepted
before being filtered by the client transform. All filtering of secure information is best done
on the server before sending it to the client.

[0074] The rendered HTML is shown in FIG. 15, which shows a browser 1502 view of the
file in FIG. 14. Fields 1506 and 1508, which correspond to the original fields 906 and 908,
respectively, are viewable, but the user cannot modify the values. Only the last field 1510,
"Followup", which corresponds to original field 910, can be edited by the user.

[0075] For another exemplary embodiment employing a semantic firewall, consider the same
medical ASP as in the previous example. In response to a SQL query from a physician such
as: "select id, status from patient_tests as xml", the back-end database generates an XML
fragment as output, as shown in FIG. 16. The fragment 1602 contains a series of records
1604, each having the ID 1606 and status 1608 of the patient from the patient tests. This
output XML 1602 from the query is passed to the first stage as input.

[0076] FIG. 17 shows the added session context information from the first stage, which, here,
are the name 1702 and role 1704 of the current authenticated user making the request. The
name 1702 and role 1704 are attributes in the top-level tag 1706. The first stage also
transforms each record element by looking up the patient identifier in an LDAP database
along with the name of the doctor of the patient. The ID tag 1606 is eliminated, and the name
1708 and doctor 1710 tags are added to produce the XML fragment in FIG. 17 as output from
the first stage.

[0077] The XML fragment shown in FIG. 17 is passed as input to the second stage. The
second stage applies complex security rules in order to transform the input by adding,
deleting, or changing tags, attributes, nodes, and node content. In this example, the second
stage adds a view attribute to each record to indicate which records should be shown or
hidden by the next stage in the pipeline. The second stage adds the view attribute to the status
tag of each record based on two rules: Rule 1: All physicians can see the list of patient
records; Rule 2: Only the physician of the patient can view a medical test record for the
patient.

[0078] Based on these rules, the second stage produces the XML fragment 1802 shown in
FIG. 18 as output. The status element 1804 of the first record 1806, the test result for Smith,
can be viewed by the current user, Jones, because (1) Jones is a physician and can view all
the records according to Rule 1, and (2) Jones is the physician for Smith in accord with Rule
2. The view attribute of the status element 1808 of the second record 1810, which is the test
result for patient Morgan, is marked "false" because even though Jones is a physician, Jones
is not the physician for Morgan.
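
The two rules of this second stage can be sketched in Python as follows. The tag names, the status values, and the doctor named Lee are hypothetical, since FIGS. 17 and 18 are not reproduced here; only the users Jones, Smith and Morgan and the role "physician" come from the description.

```python
import xml.etree.ElementTree as ET

# Hypothetical first-stage output in the spirit of FIG. 17.
STAGE1 = """<records name="Jones" role="physician">
  <record><name>Smith</name><doctor>Jones</doctor><status>negative</status></record>
  <record><name>Morgan</name><doctor>Lee</doctor><status>positive</status></record>
</records>"""

def mark_views(root):
    """Add a view attribute to the status tag of each record."""
    user, role = root.get("name"), root.get("role")
    for record in root.findall("record"):
        may_list = role == "physician"               # Rule 1
        is_own = record.findtext("doctor") == user   # Rule 2
        visible = may_list and is_own
        record.find("status").set("view", "true" if visible else "false")
    return root

root = mark_views(ET.fromstring(STAGE1))
views = {r.findtext("name"): r.find("status").get("view")
         for r in root.findall("record")}
```
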
[0079] The XML fragment shown in FIG. 18 is passed as input to the third stage, where it is
stylized based on the XML content produced as output from the second stage. The third stage
uses an XSLT style sheet to transform the XML content into extensible hypertext markup
language (XHTML). The XSLT style sheet uses the role attribute of the top-level tag, here
"physician," and the view attributes on the status elements to transform the XML content into
the appropriate XHTML for presentation in the browser of a requesting client. The XSLT
templates 1902 and 1904 shown in FIG. 19 are part of an XSLT style sheet used to transform
the content as appropriate based on the view attributes 1906 and 1908 of the status element
for each record.
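
As a rough illustration of this third stage, the following Python sketch renders records to markup according to the view attribute of each status element. The specification uses XSLT templates for this step; the sketch substitutes a Python function, and the table layout, the tag names, and the placeholder text "restricted" for hidden results are assumptions, not the output of FIG. 20.

```python
import xml.etree.ElementTree as ET

# Hypothetical second-stage output in the spirit of FIG. 18.
STAGE2 = """<records name="Jones" role="physician">
  <record><name>Smith</name><status view="true">negative</status></record>
  <record><name>Morgan</name><status view="false">positive</status></record>
</records>"""

def render(root):
    """Build a table with one row per record, hiding non-viewable statuses."""
    table = ET.Element("table")
    for record in root.findall("record"):
        status = record.find("status")
        row = ET.SubElement(table, "tr")
        ET.SubElement(row, "td").text = record.findtext("name")
        visible = status.get("view") == "true"
        # The hidden value is never copied into the output tree.
        ET.SubElement(row, "td").text = status.text if visible else "restricted"
    return ET.tostring(table, encoding="unicode")

xhtml = render(ET.fromstring(STAGE2))
```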

[0080] The fragment shown in FIG. 20 is exemplary final XHTML produced as output from
the third stage. The resulting XHTML is sent to the requesting client in the body of an HTTP
response. The third stage is implemented within the semantic firewall, which can be within a
firewall server or implemented as a post-processing stage on a Web server. If the third stage
is implemented as a post-processing stage, the stylization does not occur within the browser.

[0081] This example of a semantic firewall configuration illustrates the filtering of an
outgoing XML response. Incoming XML documents and GET/POST variables can also be
transformed by a series of filters within a semantic firewall. Elements of an HTTP GET or
POST request (e.g., header and form elements) can easily be encoded within XML and
filtered before query processing.

[0082] In an alternate embodiment, the style sheets do not depend on data content but rather
on policy rules only. In the example presented above, the generated style sheet, in FIG. 13,
for example, would be different for the different roles of the logged-in user. The style sheets
can be cached and regenerated on demand if needed. Likewise, fixed content from other
backend systems (such as the roles of system users) can also be cached in the semantic
firewall itself instead of being fetched and re-fetched for each filtering pipeline.

[0083] The semantic firewall is re-configurable in ways similar to traditional IP firewalls.
However, filter chains enable non-programmers to compose pipelines of filters that are
activated under various conditions. Similar to UNIX pipes and IP chains, filter chains can be
composed together so that the output of one filter is connected to the input of another in a
series using a terse configuration language. Unlike pipes and IP chains, however, filter
chains can insert or retract conditions that activate other filter chains that can, in turn,
redirect, clone, initiate or terminate chain activations. Whereas each filter can be
implemented as a thread, each chain is also implemented as a thread of control in the
semantic firewall process.

[0084] High performance can be achieved through the use of caching, pre-parsing,
pre-fetching, and primarily using a pipeline of SAX transformations. Transformation objects
using the Transformation API for XML (TrAX) can be pipelined together to form a chain of
transformations that feed events efficiently down a chain of tag handlers. With respect to
FIG. 5, pipelining would eliminate transforming the output of a stage into XML and then
back into an internal format between each stage. Instead, with pipelining, the output from a
stage can be input directly into the next stage without having to transform the data. Similar to
UNIX pipes, these transformers can be implemented as separate Java threads. Each thread
does not have to wait until the previous thread in the pipeline completes.
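
The threaded pipelining described above can be sketched as follows. The specification contemplates TrAX transformers running as Java threads; this sketch instead uses Python threads connected by queues, and the event tuples, the DONE sentinel, and the two example transforms are hypothetical stand-ins for SAX events and tag handlers.

```python
import queue
import threading

DONE = object()  # sentinel marking the end of the event stream

def stage(transform, inbox, outbox):
    """Run one filter: read events, transform them, pass results downstream."""
    while True:
        event = inbox.get()
        if event is DONE:
            outbox.put(DONE)
            return
        result = transform(event)
        if result is not None:  # None means the filter dropped the event
            outbox.put(result)

def run_pipeline(events, transforms):
    """Connect the transforms by queues and run each in its own thread."""
    queues = [queue.Queue() for _ in range(len(transforms) + 1)]
    threads = [
        threading.Thread(target=stage, args=(t, queues[i], queues[i + 1]))
        for i, t in enumerate(transforms)
    ]
    for th in threads:
        th.start()
    for event in events:
        queues[0].put(event)  # downstream stages start work immediately
    queues[0].put(DONE)
    out = []
    while (item := queues[-1].get()) is not DONE:
        out.append(item)
    for th in threads:
        th.join()
    return out

# Hypothetical events: (tag, access) pairs standing in for SAX events.
events = [("PID", "view"), ("INSURER", "none"), ("FOLLOWUP", "edit")]
drop_hidden = lambda e: None if e[1] == "none" else e
tag_only = lambda e: e[0]
output = run_pipeline(events, [drop_hidden, tag_only])
```

Because each stage forwards an event as soon as it is processed, no stage waits for the preceding stage to finish the whole document, and no intermediate XML is serialized between stages.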

[0085] For example, consider a set of three filter chains in which any incoming request must
first be authenticated in order to be handled by any firewall filters. FIG. 21 shows an
exemplary configuration file for the semantic firewall with three filter chains. The left-hand
sides 2102, 2104, and 2110 of the filter chain rules 2114, 2116 and 2118, respectively,
represent conditions and events, while the right-hand sides 2108, 2106, and 2112,
respectively, of the filter chain rules represent a series of input-output filters. The first chain
2114 is triggered on any incoming HTTP request for any document type or URL. The
authenticate filter 2102 is the first filter to be invoked. If the authentication is successful, the
authenticated condition 2104 (i.e., event) is introduced. In the case of the OUT event 2110,
the backend has produced content (in HTML, XML, etc.) to be processed by the semantic
firewall. In this case, the HTTP response is dispatched to the appropriate filters (other filter
chains not shown) or the content is simply passed through the identity filter (called
'passthru') 2112. This triggers the second filter chain 2116 to dispatch 2106 an HTTP
request to the appropriate filters. If authentication fails at filter 2102, the failed
authentication condition is piped to the "autherror" filter 2108. A chain can be terminated
prematurely and introduce other conditions and events or proceed along the chain.
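
The way one chain can introduce a condition that activates another chain can be sketched in Python. This is not the configuration language of FIG. 21, which is not reproduced here; the CHAINS table, the KNOWN_USERS set, the request fields, and the simplification of one filter per chain are all assumptions made for illustration.

```python
KNOWN_USERS = {"jones"}  # hypothetical credential store

def authenticate(request):
    """Raise 'authenticated' or 'autherror' depending on the requesting user."""
    ok = request.get("user") in KNOWN_USERS
    return ("authenticated" if ok else "autherror"), request

def dispatch(request):
    """Stand-in for routing the request to the appropriate filters."""
    request["handled_by"] = "dispatch"
    return None, request  # None ends the chain activation

def autherror(request):
    """Stand-in for the error filter reached when authentication fails."""
    request["status"] = 401
    return None, request

def passthru(response):
    """Identity filter: content passes through unchanged."""
    return None, response

# Condition or event -> filter run when that condition is introduced.
CHAINS = {
    "IN": authenticate,
    "authenticated": dispatch,
    "autherror": autherror,
    "OUT": passthru,
}

def run(condition, message):
    """Follow chain activations until no further condition is introduced."""
    trace = []
    while condition is not None:
        trace.append(condition)
        condition, message = CHAINS[condition](message)
    return trace, message

ok_trace, ok_msg = run("IN", {"user": "jones"})
bad_trace, bad_msg = run("IN", {"user": "eve"})
out_trace, out_msg = run("OUT", {"body": "<html/>"})
```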

[0086] The inventive semantic firewall can be used in a number of applications and in
different configurations, for example, for document routing and content management. Search
engine technologies such as the JHU/APL HAIRCUT engine can be adapted to watch
incoming and outgoing documents that pass through the semantic firewall. Incoming
documents can be tagged and classified while outgoing documents can be annotated with
links to related documents at the document, paragraph, and word levels.

[0087] The inventive semantic firewall can be used for enterprise form fill. The swiftID
product of Sphere Software Corp. can be included in the semantic firewall to recognize web
form field names automatically, translate the form field names into their semantic
equivalents, and look up profile information to fill in form fields. Rules can be used to
introduce state-dependent field contents and implement inter-field dependencies, for
example, credit card and expiration date form fields for a transaction can change when either
field is changed.

[0088] The inventive semantic firewall can be used as a Model 2 presentation paradigm. The
semantic firewall can hold state information concerning the model-view-controller
dependencies for backend application servers that have yet to implement the Model 2
paradigm. Holding the state information can enable migration of existing web applications
towards the Model 2 approach.

[0089] The inventive semantic firewall can be used for encryption and authentication. The
semantic firewall can be used to wrap an existing web site with an authentication shell as well
as encrypt specific tags in XML content. Dynamically generated XSLT style sheets can
introduce JavaScript CDATA elements (via xsl:script elements) that can prompt the user for
the private key passphrase of the user to decrypt a field.

[0090] The inventive semantic firewall can be used for a semantic wireless Web. The
semantic firewall can be used to wrap a web site with rule-based transforms to VoiceML,
WAP, and WML output using dynamically generated XSLT style sheets.

[0091] The inventive semantic firewall can be used in semantic auditing. Machine learning,
neural network, and other artificial intelligence (AI) technologies can be used to watch XML
content traffic at the tag level and log activities.

[0092] The inventive semantic firewall can be used for portal management. Portal sites
provide an ability to customize page presentation to include news, discussions, and other
content management capabilities to non-programmers. The semantic firewall can be used to
open the authoring of rules directly to the end user. An end user can put together filter chains
to improve the richness of the portal page behaviors on the portal page that the end user sets up.

[0093] The inventive semantic firewall can be used in legacy database integration. The
semantic firewall can be used to wrap a plain, backend legacy database with a semantic
firewall that generates simple object access protocol (SOAP) wrappers as an access point to
the legacy system. Since SOAP implements remote procedure call (RPC) over HTTP, the
semantic firewall filters the incoming SOAP requests and transforms the requests into
database access calls.

[0094] The inventive semantic firewall can be used for address validation. For incoming
HTTP POST requests, the meaning of a form field can be guessed, populated with existing
data from previous transactions, and validated with ASP services such as, for example,
Centris by Sagent. Incoming product registrations and other use data can be checked in the
semantic firewall before accessing the backend database and web application server.

[0095] The inventive semantic firewall can be used as a financial firewall. The semantic
firewall can filter, audit, and limit (by amount or by role and task access controls) XML
content with financial data. High-level rules can control access based on field-specific data,
client confidentiality, role-based permissions, and budget levels.

[0096] In an exemplary embodiment, the application server and semantic firewall can be
implemented separately or in combination by one or more computer systems.

[0097] In an exemplary embodiment, the software to implement the application server and 

semantic firewall can be stored on one or more computer-readable media. 

[0098] Although the current invention has been described with respect to XML data types,
the invention can be implemented in other computer languages and can employ other data
types.

[0099] The embodiments and examples discussed herein are non-limiting examples. 

[00100] While various embodiments of the present invention have been described above, it
should be understood that they have been presented by way of example only, and not
limitation. Thus, the breadth and scope of the present invention should not be limited by any
of the above-described exemplary embodiments, but should instead be defined only in
accordance with the following claims and their equivalents.

Claims

What is claimed is:

1. A system for processing data requests from clients via a network, comprising:

an application server coupled to said network, said application server providing
content from a database to said clients via said network; and

a semantic firewall to pass and filter said content between said application server and
said clients, said semantic firewall restricting access to a portion of said content for at least
one client.

2. A system as in claim 1, wherein said semantic firewall annotates content from said
database to obtain annotated data, filters said annotated data to obtain filtered data, and
renders said filtered data to obtain rendered data.

3. A system as in claim 1, wherein said semantic firewall comprises:
means for annotating content from said database to obtain annotated data;
means for filtering said annotated data to obtain filtered data; and
means for rendering said filtered data to obtain rendered data.

4. A system as in claim 3, wherein said means for annotating employs at least one
annotated scheme and a rule-based transformation to obtain said annotated data.

5. A system as in claim 3, wherein said means for filtering employs at least one rule
and a rule engine to obtain said filtered data.

6. A system as in claim 3, wherein said means for rendering employs at least one
static style sheet to obtain said rendered data.

7. A system as in claim 1, wherein said semantic firewall is exterior to said
application server.

8. A system as in claim 1, wherein said semantic firewall is interior to said
application server.

9. A method of processing a data request by a server, comprising the steps of:

receiving said data request from a client via a network;
retrieving requested data from a database;
annotating said requested data to obtain annotated data;
filtering said annotated data to obtain filtered data;
rendering said filtered data to obtain rendered data; and
providing said rendered data to said client via said network.

10. The method of claim 9, wherein the step of annotating comprises the steps of:
accessing a rules file, wherein said rules file defines rules for accessing data; and
annotating said requested data based on said rules in said rules file to obtain said
annotated data.

11. The method of claim 9, wherein the step of filtering comprises the steps of:
creating a style sheet dynamically; and
applying said style sheet to said annotated data, thereby filtering said annotated data
to obtain said filtered data.

12. The method of claim 9, wherein the step of rendering comprises the steps of:
accessing a static style sheet; and
applying said static style sheet to said filtered data, thereby generating said rendered
data.

13. The method of claim 9, wherein said requested data is in extensible markup
language (XML), and said rendered data is in hypertext markup language (HTML).

14. A computer system for performing the method of claim 9.

15. A computer-readable medium having software for performing the method of
claim 9.